Skip to content

SOLR-13309: Introduce FloatRangeField to expose Lucene 'FloatRange'#4229

Merged
gerlowskija merged 4 commits intoapache:mainfrom
gerlowskija:SOLR-13309-floatRangeField
Mar 24, 2026
Merged

SOLR-13309: Introduce FloatRangeField to expose Lucene 'FloatRange'#4229
gerlowskija merged 4 commits intoapache:mainfrom
gerlowskija:SOLR-13309-floatRangeField

Conversation

@gerlowskija
Copy link
Contributor

@gerlowskija gerlowskija commented Mar 20, 2026

Description

We recently added Solr field types "IntRangeField" and "LongRangeField" to allow users to index and search integer and long ranges (e.g. prices, business hours, grade ranges). But not everything is an int/long. Lucene has additional "range" types: DoubleRange, and FloatRange.

This PR tackles exposing FloatRange in Solr as FloatRangeField. This is ideal for things that may not fit as a standard "whole number".

Solution

This approach takes a very similar approach to that taken by #4192. AbstractNumericRangeFieldType contains most of the plumbing common to these "range" types, so the new implementation added by this PR FloatRangeField contains only that code that differs from its base/parent class.

The main difference between FloatRangeField and the pre-existing IntRangeField and LongRangeField is the value parsing, which accepts floating point values that (e.g. 3.14) that would be rejected as bounds in an IntRangeField.

Tests

FloatRangeFieldTest has unit tests for the new functionality, with more "integration" functionality tested in NumericRangeQParserPluginFloatTest. Both of these test classes borrow heavily from their int and long counterparts.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide
  • I have added a changelog entry for my change

This commit adds a new field type, FloatRangeField, that can be used to
hold singular or multi-dimensional (up to 4) ranges of floats.

FloatRangeField is compatible with the previously added
`{!numericRange}` and supports similar syntax.
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Mar 20, 2026
@gerlowskija gerlowskija marked this pull request as ready for review March 20, 2026 16:14
* Regex fragment matching a comma-separated list of signed floating-point numbers (integers or
* floating-point literals).
*/
protected static final String COMMA_DELIMITED_FP_NUMS =
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0] All of these regexes are a bit messy.

Ultimately the idea here is that we want the validation for each implementing type (int, float, etc.) to be specific to what that type looks like. IntRangeField needs to be able to reject fp-values, etc.

This validation is done by regex. The regexes live in this base class because some code here that invotes this verification. Individual sub-classes specify the regex pattern they want to use by using overrideable methods getRangePattern and getSingleBoundPattern.

So that's my rationale here. Open to other ways of doing it if folks can see a better approach....

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your current approach is already really clean IMO, since it keeps validation in one place.

I did consider a more type agnostic two-step approach: use just one regex to validate the overall[... TO ...]structure to reduce some of the regex complexity, then let parseFloatArray in FloatRangeField.java reject invalid values downstream. But mismatched-dimension edge cases would need separate handling, so that ends up splitting the validation rather than really simplifying it, which also doesn't align with your rationale.

  • One small cleanup idea: since RANGE_PATTERN_STR and FP_RANGE_PATTERN_STR share the same overall structure and mainly differ in the numeric regex fragment they embedded, would it be worth introducing a small helper like buildRangePattern(String numericFragment) to centralise the range pattern assembly, and maybe even the compilation step as well? That might reduce the duplication a little.

Copy link
Contributor

@HoustonPutman HoustonPutman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general looks great, the validation for floating point numbers could always be improved in the future.

protected static final String COMMA_DELIMITED_FP_NUMS =
"-?\\d+(?:\\.\\d+)?(?:\\s*,\\s*-?\\d+(?:\\.\\d+)?)*";

private static final String FP_RANGE_PATTERN_STR =
Copy link

@chan-dx chan-dx Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed a small gap in the float validation regex: it doesn't accept scientific notation. I tested FP_RANGE_PATTERN_STR pattern on regex101.com, and an input like [1.1 TO 1.5e10] does not match.

Was the exclusion of scientific notation intentional? Or are values like this expected to be rejected before reaching this validation step?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was intentional, but I'm starting to second guess that decision now. My assumption at the time was that our existing float and double fields (e.g. DoublePointField) also didn't support scientific notation. There's no mention of that in the ref-guide afaict at least.

But in testing it just now I can see that they do actually support it.

Thanks for raising this - will fix.

@gerlowskija gerlowskija merged commit a38d81e into apache:main Mar 24, 2026
5 checks passed
@gerlowskija gerlowskija deleted the SOLR-13309-floatRangeField branch March 24, 2026 16:50
gerlowskija added a commit that referenced this pull request Mar 24, 2026
…4229)

Mirrors recently added 'IntRangeField' and 'LongRangeField' types.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cat:schema cat:search documentation Improvements or additions to documentation tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants